Tinni Choudhury – t.choudhury@mdx.ac.uk
Neesha Kodagoda – n.kodagoda@mdx.ac.uk
Phong Nguyen (Student) – p.nguyen@mdx.ac.uk
Chris Rooney – c.rooney@mdx.ac.uk
Simon Attfield – s.attfield@mdx.ac.uk
Kai Xu – k.xu@mdx.ac.uk
Yongjun Zheng – y.zheng@mdx.ac.uk
William Wong – w.wong@mdx.ac.uk
Raymond Chen – r.chen@mdx.ac.uk
Glenford Mapp – g.mapp@mdx.ac.uk
Louis Slabbert – l.slabbert@mdx.ac.uk
Mahdi Aiash – m.aiash@mdx.ac.uk
Aboubaker Lasebae – a.lasebase@mdx.ac.uk
Student Team: No
· Middlesex University’s Concern Level Assessment (CLA) Rules: In order to determine what the various patterns within the data meant and which of the patterns were anomalous, the Middlesex University VAST 2012 team developed a system to calculate a Concern Level based on consultation with domain experts within the submission team. The Concern Level is based on the parameters of machine type (class and function), policy status, activity flag, number of connections (statistically determined to be normal or abnormally high for a given class and function of machine) and time of day (work hours or after hours). The Concern Level is calculated from a set of 97 inference rules developed with the aid of experts and is on a scale of 0 (no concern, e.g. the machine has activity flag 1, policy status 1 and a number of connections within the mean + 1 standard deviation for that machine class at that time of day) to 5 (high concern, e.g. a machine that has a virus or, for a more complex example, a machine that has an abnormally high number of connections for its machine type, is suffering from 100% CPU consumption, and it is after hours). In addition, we have a special concern alert of 6 for policy status 5 and activity flag 5, as this indicates that a virus-infected machine has had an external device added; this is of extra concern as it may result in the infection spreading.
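A minimal sketch of this rule evaluation is given below, assuming a simple dictionary representation of a machine's status record. The field names, thresholds and the handful of rules shown are illustrative placeholders for the full set of 97 expert-derived inference rules.

def concern_level(machine):
    """Return a Concern Level on the 0-5 scale, or the special alert value 6."""
    # Special alert: policy status 5 (virus infected) together with activity
    # flag 5 (external device added) risks spreading the infection.
    if machine["policy_status"] == 5 and machine["activity_flag"] == 5:
        return 6

    # High concern: the machine is virus infected, or it has an abnormally
    # high number of connections for its class plus 100% CPU use after hours.
    if machine["policy_status"] == 5:
        return 5
    if (machine["connections"] > machine["class_mean"] + machine["class_sd"]
            and machine["cpu"] == 100
            and not machine["work_hours"]):
        return 5

    # No concern: healthy machine whose connection count is within one
    # standard deviation of the mean for its class at that time of day.
    if (machine["policy_status"] == 1
            and machine["activity_flag"] == 1
            and machine["connections"] <= machine["class_mean"] + machine["class_sd"]):
        return 0

    # The remaining rules (omitted here) map the other combinations of policy
    # status, activity flag, machine class/function, connection count and time
    # of day onto the intermediate levels 1-4; a mid value stands in for them.
    return 3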
Video:
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create a visualization of the health and policy status
of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on
February 2. What areas of concern do you observe?
We began by viewing the data using the two views offered by M-Shieve (see Figure 1 and Figure 2).
Figure 1: M-Shieve Spatial View
Figure 2: M-Shieve Concern Level Assessment (CLA)
We note the following:
· 1 machine at CLA HIGH
o Compute Server 172.2.194.20 in HQ datacenter-2, Region 36, rated HIGH due to virus infection.
· 811 machines at CLA MEDIUM HIGH
o For some machines this is due to policy status.
o For some machines this is due to the activity flag and/or the number of connections for the machine type. For example, Web Server 172.8.28.77 has activity flag 4 and a statistically high number of connections for a web server. This can indicate a denial of service attack.
· 12,065 machines at CLA MEDIUM
o For the majority this is due to a policy status of 3.
o For the remainder this is due to a combination of activity level, connection numbers and machine type. For example, some servers in the Region 10 Headquarters are rated CLA MEDIUM due to the number of login failures. These are expected for workstations but not for servers.
· Figure 3 shows the distribution of CLA values for Region 10 HQ machines in the M-Shieve CLA view.
We note that all have some level of concern, with the majority rated MEDIUM
LOW. We determine that the facility as a whole is a cause for concern.
Figure 3: CLA View For Region 10 HQ
· The majority of the LOW and MEDIUM LOW concern levels were due to machines with a policy status other than 1. These concerns are also flagged for a high number of connections, login failures, and high CPU consumption on specific machine types.
MC 1.2 Use your visualization tools to look at how the network’s
status changes over time. Highlight up to five potential anomalies in the
network and provide a visualization of each. When did each anomaly begin and
end? What might be an explanation of each anomaly?
· 2nd February, 12:45:00 BMT - The BoM network was infected with a virus for the first time. The infection was contracted by Compute Server 172.2.194.20, Datacenter 2, Region 36 (see Figure 4). At the start of the dataset it had policy status 2. This slowly escalated over the subsequent four and a half hours until the machine was designated as having a virus. The infection may have been network borne, as no external device was added prior to the machine becoming infected in the data we have. If the infection was network borne, the policy-deviated status of the machine would have made it more vulnerable to an attack. Alternatively, the machine may have been infected at the beginning of the dataset and the steady escalation of the policy status from 2 to 5 was a symptom of this infection. We cannot be certain, since at no time within the dataset does the machine have a normal policy status. The machine remained infected until the end of the dataset.
Figure 4: Region 36, Datacenter 2 showing a
virus infected Compute Server
· 2nd February, 15:45:00 BMT - A second machine was infected: Teller Workstation 172.41.188.35, Branch 30, Region 26. The local time was 12:45 PM. The dataset starts at 3:15 AM local time and Workstation 172.41.188.35 is “on” after hours, a deviation from BoM business rules. However, its policy status did not escalate until work began: it rose to 2 at 8:15 AM local time and then steadily increased to 5. It remained infected until the end of the dataset. Given that the escalation in policy status did not begin until the start of work, this machine may have contracted the virus by communicating with Server 172.2.194.20.
· 2nd February, 16:00:00 BMT - File Server 172.1.243.7, Datacenter 1, Region 15 became infected and remained so until the end of the dataset. This server started as normal and became policy deviated at 09:00:00 BMT, after which the policy status escalated over the following 7 hours until it was flagged as virus infected. It is possible that this server communicated with either Workstation 172.41.188.35 or Server 172.2.194.20 and that the slow escalation of the policy status is a signature of the contracted virus.
· 2nd February, 16:15:00 BMT - ATM 172.19.176.7 at Branch 117, Region 5, became infected and remained so until the end of the dataset. Like Compute Server 172.2.194.20, ATM 172.19.176.7 did not have policy status 1 within the scope of the dataset. It also experienced a slow escalation in policy status and did not have any physical devices added immediately prior to infection. At the same time Loans Workstation 172.26.215.30 also became infected, bringing the total number of infected machines to five. It should also be noted that at this time both Server 172.2.194.20 and Server 172.1.243.7 showed abnormally high numbers of connections.
· On the 4th of February at 08:00:00 BMT, the BoM network had 5,963 infected machines from a starting total of 0, with the virus spread being most pronounced after hours (see Figure 5).
Figure 5: How the virus spread over time
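The infection count over time can be reproduced with a short aggregation over the health/status records, treating policy status 5 as virus infected (as above). The file name and column names in this sketch are assumptions, not the actual feed layout.

import pandas as pd

# Hypothetical file and column names for the BoM health/status feed.
status = pd.read_csv("bom_health_status.csv", parse_dates=["timestamp"])

# Policy status 5 marks a machine as virus infected.
infected = status[status["policystatus"] == 5]

# Cumulative number of distinct machines seen infected up to each report time.
spread = (
    infected.sort_values("timestamp")
            .drop_duplicates("ipaddr")        # first report at status 5 per machine
            .groupby("timestamp")["ipaddr"]
            .size()
            .cumsum()
)
print(spread.tail())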
· Workstations across the network remained on after hours despite BoM’s business rule encouraging employees to turn off their workstations. However, no obvious anomalous activities, such as failed log-in attempts, 100% CPU consumption or attachment of external devices, took place after hours. Over the course of the dataset, the workstations left on overnight showed increasing levels of activity. This is seen most clearly in the middle picture of Figure 6, a snapshot from 08:15:00 BMT on 02-03-2012. This can be contrasted with the snapshot from 08:15:00 BMT on 02-02-2012, when only a handful of machines had a high number of connections. By the end of the dataset, the connections had stabilized but the health of the workstations left on had deteriorated, as can be seen from Figure 6. It is very likely that the workstations that remained on aided the spread of the virus that infected the BoM network, and the high nighttime connectivity is a symptom of the virus attempting to spread itself.
Figure 6: Workstations left switched on during the night become policy deviated and virus infected.
· 2nd February, 08:15:00 BMT - Regions 5 and 10 are particularly unhealthy (see Figure 7). All machines in Region 5 begin with policy status 2. This escalates until the end of the dataset, by which time the majority of the machines had policy status 4 or 5 and none had less than 3. This anomaly is probably due to a lack of maintenance, as machines that were flagged as going down for maintenance routinely returned with the same policy status as before.
Figure 7: Region 10 and Region 5 (policy status 2 shown in green)
· In Region 10, only three machines had a normal policy status. These were in Datacenter 5. While the rest of the region got sicker, Datacenter 5 actually improved, ending with 26.9% of the machines left on having a normal policy status. This is anomalous given the health of the rest of the region.
Figure 8: Region 10, with Datacenter 5 having 3 machines with policy status 1; two abnormal connections are also clearly visible.
· Additional anomalous behavior in Region 10 involves teller workstations. At the start of the dataset, two tellers (172.30.60.42 and 172.30.60.40) had abnormally high connections (see Figure 8). The mean for teller machines is 9.92. An hour later, eighteen teller machines developed abnormal connectivity levels ranging from 40 to 98. This pattern of high connectivity continues until office hours begin at 12:15:00 BMT (see Figure 9), at which point the high number of connections subsides. However, the abnormally high network activity begins anew on 02-03-2012 at 08:15:00 BMT and ends at 12:15:00 BMT (see Figure 10). This suggests that the teller machines of this region may be part of a bot-net with pre-defined activity times; a sketch of the statistical flagging applied to these connection counts is given after Figure 10.
Figure 9: Region 10 abnormal teller activity returning to normal on 02-02-2012
Figure 10: Region 10 abnormal teller activity starting and stopping on 02-03-2012
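A minimal sketch of this kind of statistical flagging is shown below, assuming connection counts per teller have already been extracted. The 9.92 mean for teller workstations comes from the text; the standard deviation and the sample connection counts are illustrative assumptions.

def abnormal_connections(counts, baseline_mean, baseline_sd, k=1.0):
    """Flag machines whose connection count exceeds the baseline mean for
    their machine class and time of day by more than k standard deviations."""
    threshold = baseline_mean + k * baseline_sd
    return {ip: c for ip, c in counts.items() if c > threshold}

# The 9.92 mean for teller workstations is from the text; the standard
# deviation and these sample connection counts are assumptions.
flagged = abnormal_connections(
    {"172.30.60.42": 98, "172.30.60.40": 67, "172.30.60.41": 10},
    baseline_mean=9.92,
    baseline_sd=4.0,
)
print(flagged)  # {'172.30.60.42': 98, '172.30.60.40': 67}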